Research on Job Scheduling Algorithm in Hadoop
Authors
Abstract
Building on an in-depth study of the Fair Scheduling strategy in Hadoop clusters, this paper defines the Node Health Degree by constructing a function that relates node load to job failure rate, and proposes a job scheduling algorithm based on it. Nodes are grouped into three categories according to their Node Health Degree, so that each node is assigned jobs in accordance with its load and the resource load stays balanced. Simulation results comparing it with the FIFO and Fair scheduling algorithms show that the proposed algorithm reduces the job failure rate and improves cluster throughput.
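The abstract does not give the actual relationship function or the category boundaries, so the following is only a minimal sketch of the grouping idea under stated assumptions. The weighted-sum formula in `health_degree`, the `load_weight` parameter, the thresholds `low` and `high`, and the category names are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    load: float       # assumed: normalized node load in [0, 1]
    fail_rate: float  # assumed: fraction of recent jobs that failed, in [0, 1]


def health_degree(node: Node, load_weight: float = 0.5) -> float:
    """Hypothetical health score in [0, 1]; higher means healthier.

    A weighted sum stands in for the paper's (unspecified) function
    relating node load and job failure rate.
    """
    penalty = load_weight * node.load + (1.0 - load_weight) * node.fail_rate
    return 1.0 - penalty


def categorize(node: Node, low: float = 0.4, high: float = 0.7) -> str:
    """Place a node in one of three categories (thresholds are assumptions)."""
    h = health_degree(node)
    if h >= high:
        return "healthy"
    if h >= low:
        return "moderate"
    return "unhealthy"


def group_nodes(nodes: list[Node]) -> dict[str, list[str]]:
    """Group node names by category so a scheduler can pick targets by load."""
    groups: dict[str, list[str]] = {"healthy": [], "moderate": [], "unhealthy": []}
    for node in nodes:
        groups[categorize(node)].append(node.name)
    return groups


nodes = [Node("n1", 0.2, 0.05), Node("n2", 0.6, 0.30), Node("n3", 0.9, 0.80)]
groups = group_nodes(nodes)
print(groups)
```

A scheduler built on this grouping could, for example, send new jobs to the healthiest group first and hold back nodes in the lowest group until their load drops; how the paper maps jobs to categories is not stated in the abstract.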
Similar resources
A Hadoop Job Scheduling Algorithm Based on Pagerank
To address the poor performance of job scheduling algorithms based on the classical cloud computing model in Hadoop, a new job scheduling algorithm based on the PageRank algorithm is proposed. While preserving the user experience, we propose a new job scheduling algorithm named ValidRank, which combines hierarchical weight and waiting time. Then for th...
Job Attentive Scheduling Algorithm in Hadoop
In recent years cloud services have gained much attention as a result of their availability, scalability, and low cost. One use of these services has been for the execution of scientific workflows as part of Big Data Analytics, which are employed in a diverse range of fields including astronomy, physics, seismology, and bioinformatics. There has been much research on heuristic scheduling algori...
The Improved Job Scheduling Algorithm of Hadoop Platform
This paper discusses several job scheduling algorithms for the Hadoop platform and, in view of the shortcomings of the algorithms currently in use, proposes a job scheduling optimization algorithm based on Bayes classification. The proposed algorithm can be summarized as follows. In the scheduling algorithm based on Bayes classification, the jobs in the job queue are classified into bad job and...
Improved Fair Scheduling Algorithm for Hadoop Clustering SNEHA and SHONEY SEbASTIAN
Traditional ways of storing such huge amounts of data are inconvenient because processing the data in later stages is very tedious. Hadoop is therefore now used to store and process large amounts of data. Statistics on the data generated in recent years show that the volume has been especially high in the last 2 years. Hadoop is a good framework for storing and processing data efficiently. It works li...
Hadoop Scheduling Base On Data Locality
In Hadoop, job scheduling is an independent module: users can design their own job scheduler based on their actual application requirements and thereby meet their specific business needs. Currently, Hadoop has three schedulers: FIFO, capacity scheduling, and fair scheduling, all of which use task allocation strategies that consider data locality only in a simple way. They neither suppor...